Load required package:
library(marmap)
Import bathymetry data from NOAA:
b = getNOAA.bathy(lon1 = 115, lon2 = 140, lat1 = 35, lat2 = 20, resolution = 1)
## Querying NOAA database ...
## This may take seconds to minutes, depending on grid size
## Building bathy matrix ...
Import dataframe with the latitude and longitude of sampling points (in decimal degrees):
pts <- read.csv("~/Desktop/MiraiFilters/SamplingLatLong.csv")
pts$Station<- as.character(pts$Station)
str(pts)
## 'data.frame': 14 obs. of 3 variables:
## $ Lon : num 126 128 127 126 125 ...
## $ Lat : num 26.3 25.4 25.9 26.5 26 ...
## $ Station: chr "2" "3" "4" "5" ...
Make dataframe with map labels and their lat/long:
Clon = c(127.5,131.25, 121.6, 121.03, 120.8, 126)
Clat= c(26, 31.93, 31.1, 23, 27.2, 22)
CLabels= c("Okinawa", "Japan", "China", "Taiwan", "East China Sea", "Philippine Sea")
Countries = cbind.data.frame(Clon, Clat, CLabels)
str(Countries)
## 'data.frame': 6 obs. of 3 variables:
## $ Clon : num 128 131 122 121 121 ...
## $ Clat : num 26 31.9 31.1 23 27.2 ...
## $ CLabels: Factor w/ 6 levels "China","East China Sea",..: 4 3 1 6 2 5
Define colors to use in plot:
blues <- c("lightsteelblue4", "lightsteelblue3", "lightsteelblue2", "lightsteelblue1")
greys <- c(grey(0.6), grey(0.93), grey(0.99))
Make plot one layer at a time:
#tiff("MiraiFiltersMap.tiff", height= 8, width =8, units = 'in', res=300) # uncomment to save plot to file
plot(b, image = TRUE, land = TRUE, xlim=c(120,135), ylim=c(22, 32), lwd = 0.03, bpal = list(c(0, max(b), greys), c(min(b),0 , blues)), cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5) #bathymetry colors
plot(b, lwd = 0.8, deep = 0, shallow = 0, step = 0, add = TRUE) # highlight coastline with solid black line
plot(b, deep=-200, shallow=-200, lwd = 0.4, drawlabels=T, add=T) # Add -200m isobath
plot(b, deep=-1000, shallow=-1000, lwd = 0.4, drawlabels=T, add=T) # Add -1000m isobath
plot(b, deep=-2000, shallow=-2000, lwd = 0.4, drawlabels=T, add=T) # Add -2000m isobath
plot(b, deep=-3000, shallow=-3000, lwd = 0.4, drawlabels=T, add=T) # Add -3000m isobath
points(pts, pch = 16, col = 'red') # Add sampling points
text(pts, labels=pts$Station, adj= 1.5, cex=1.2) # Add sampling station numbers
text(Countries, labels=Countries$CLabels, cex= 1.7, adj = -0.2) # Add map labels
#dev.off() #uncomment to save plot to file
Sample collection, DNA extraction, PCR, & sequencing
Seawater collected from the Deep Chlorophyll Maximum (DCM), mid-water column, and near-bottom was collected in 10 L Niskin bottles at each sampling station and surface seawater was collected by bucket. Five Liter replicates from each depth were sequentially filtered through 10.0 and 0.2 um pore-size Polytetrafluoroethylene (PTFE) filters. Filters were flash frozen in liquid nitrogen and stored at -80 C until DNA extraction. DNA was extracted following the manufacturers protocols for the Qiagen/MoBio DNeasy PowerWater Kit, including the optional heating step to lyse cells. Amplicon PCR and sequencing library preparation followed the procedures described in the Illumina 16S Metagenomic Sequencing Library Preparation manual modified only to include universal eukaryotic primers for the v4 region of the 18S rRNA gene (F: Stoeck et al. 2010, R: Brisbin et al. 2018) and amplicon PCR conditions most appropriate for these primers. The 220 samples were radomly assigned to 4 sequencing runs on the Illumina MiSeq sequencing platform with v3 chemistry for 300x300-bp paired-end sequencing.
Bioinformatic processing with Qiime2
Demultiplexed paired-end sequences provided by the OIST sequencing center were imported to Qiime2 (v2018.11) software (Bolyen et al. 2018). The Divisive Amplicon Denoising Algorithm (DADA) was implemented with the DADA2 plug-in for Qiime2 for quality filtering, chimera removal, and feature table construction (Callahan et al. 2015) separately for each sequencing run before merging the four resulting feature tables. Taxonomy was assigned to Amplicon Sequence Variants (ASVs) in the feature table by training a classifier on the Protist Ribosomal Reference (PR2) database to classify ASVs (Guillo et al. 2012). The feature table and assigned taxonomy were then exported from Qiime2 for further analysis.
Below is an example Qiime2 workflow for 1 sequencing run:
Activate an anaconda environment for qiime2: source activate qiime2-2018.11
Import sequencing data (all sequences for eacg sequencing run must be in their own directory): qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ./seqs/seqrun1/ --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path run1_demux-paired-end.qza
Visualize squencing results (use these graphs to decide what length to trim sequences) qiime demux summarize --i-data run1_demux-paired-end.qza --o-visualization run1_demux.qzv
Run the DADA2 algorithm: qiime dada2 denoise-paired --i-demultiplexed-seqs run1_demux-paired-end.qza --output-dir ./dd2run1 --o-representative-sequences run1_rep-seqs --p-trim-left-f 10 --p-trim-left-r 10 --p-trunc-len-f 260 --p-trunc-len-r 250 --p-n-threads 3
Check results: qiime feature-table summarize --i-table ./dd2run1/table.qza --o-visualization ./dd2run1/table.qzv --m-sample-metadata-file ~/desktop/MiraiFilters/sampleMaptxt.txt
qiime metadata tabulate --m-input-file dd2run1/denoising_stats.qza --o-visualization dd2run1/denoising_stats.qzv
Merge feature tables from multiple sequencing runs: qiime feature-table merge --i-tables ./dd2run1/table.qza --i-tables ./dd2run2/table.qza --i-tables ./dd2run3/table.qza --i-tables ./dd2run4/table.qza --o-merged-table mergedtable.qza
qiime feature-table merge-seqs --i-data run1_rep-seqs.qza --i-data run2_rep-seqs.qza --i-data run3_rep-seqs.qza --i-data run4_rep-seqs.qza --o-merged-data merged-rep-seqs.qza
Import taxonomy database: qiime tools import --type 'FeatureData[Sequence]' --input-path pr2_version_4.11.1_mothur.fasta --output-path pr2.qza
qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path pr2_version_4.11.1_mothur.tax -output-path pr2_tax.qza
Select v4 region from PR2 sequences: qiime feature-classifier extract-reads --i-sequences pr2.qza --p-f-primer CCAGCASCYGCGGTAATTCC --p-r-primer ACTTTCGTTCTTGATYRA --o-reads pr2_v4.qza
Train the classifier: qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads pr2_v4.qza --i-reference-taxonomy pr2_tax.qza --o-classifier pr2_v4_classifier.qza
Classify: qiime feature-classifier classify-sklearn --i-classifier pr2_v4_classifier.qza --i-reads merged-rep-seqs.qza --o-classification taxonomy.qza
qiime metadata tabulate --m-input-file taxonomy.qza --o-visualization taxonomy.qzv
Export the .tsv taxonomy file from taxonomy.qzv for use in following steps.
Load packages:
library(tidyverse)
#devtools::install_github("jbisanz/qiime2R")
library(qiime2R)
library('phyloseq')
library('vegan')
library('ggplot2')
library(RColorBrewer)
library(DESeq2)
library(reshape)
library("wesanderson")
library("gridExtra")
Set default colors:
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
P <- brewer.pal(12, "Paired")
S2 <- brewer.pal(8, "Set2")
S1 <- brewer.pal(8, "Set1")
D2 <- brewer.pal(8, "Dark2")
colors<- c(P, S2, S1,D2)
colors2 <- rep(colors, 6)
Convert Qiime2 artifacts to phyloseq objects:
phyloseq<-qza_to_phyloseq(features="mergedtable.qza")
## Warning in read_qza(features): Artifact was not generated with Qiime2
## 2018.4, if data is not successfully imported, please report here
## github.com/jbisanz/qiime2R/issues
Import sample data:
metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$Station <- as.character(metatable$Station)
metatable$filter <- as.character(metatable$filter)
META<- sample_data(metatable)
str(META)
## 'data.frame': 219 obs. of 5 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 5
## .. ..$ : Factor w/ 219 levels "A6-F173-DNA",..: 3 6 9 12 96 97 115 118 129 134 ...
## .. ..$ : chr "2" "2" "2" "2" ...
## .. ..$ : Factor w/ 4 levels "Bottom","Mid",..: 4 4 4 4 1 1 2 2 3 3 ...
## .. ..$ : int 0 0 0 0 1000 1000 700 700 92 92 ...
## .. ..$ : chr "10" "2" "10" "2" ...
## ..@ names : chr "SampleID" "Station" "Depth" "m" ...
## ..@ row.names: chr "A8-F17-DNA" "B8-F18-DNA" "C8-F19-DNA" "D8-F20-DNA" ...
## ..@ .S3Class : chr "data.frame"
Load PR2 taxonomy:
taxonomy <- read.csv("taxonomy.csv", stringsAsFactors = FALSE)
names(taxonomy) <- c("row", "tax", "Confidence")
row.names(taxonomy) <-taxonomy[[1]]
taxonomy <- taxonomy[,(-1)]
taxonomy <- separate(taxonomy, tax, c("D0","D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "D10", "D11", "D12", "D13", "D14"), sep = ";", fill = "right")
taxonomy <- taxonomy[,c(1:8)]
taxmat <- as.matrix(taxonomy)
TAX = tax_table(taxmat)
str(TAX)
## Formal class 'taxonomyTable' [package "phyloseq"] with 1 slot
## ..@ .Data: chr [1:23115, 1:8] "Eukaryota" "Eukaryota" "Eukaryota" "Eukaryota" ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:23115] "000a220fe8edd771dd82f6915627ef3e" "000c79760dd16d6692018deb8ea1b164" "000d9eaa21eaff08f34fb1a1a87601d5" "000e6f109e01181b2b1eda5d6c99319c" ...
## .. .. ..$ : chr [1:8] "D0" "D1" "D2" "D3" ...
Add taxonomy to phyloseq object:
ps = merge_phyloseq(phyloseq, TAX, META)
Convert ASV counts to relative abundance:
physeqRA<- transform_sample_counts(ps, function(OTU) 100* OTU/sum(OTU))
Principal Coordinate Analysis (PCoA) for Bray-Curtis distances between all samples:
ordu = ordinate(physeqRA, "PCoA", "bray")
p<-plot_ordination(physeqRA, ordu, color="Depth", shape="filter", label="SampleID")+theme_bw() +scale_color_manual(values=cbPalette)+ geom_point(size=2)
p
Remove mislabeled samples:
mixedup <- c("F43", "F44", "F45", "F46", "F47", "F48", "F84")
'%ni%' <- Negate('%in%')
ps<- subset_samples(ps, SampleID %ni% mixedup)
physeqRA<- transform_sample_counts(ps, function(OTU) 100* OTU/sum(OTU))
Bray-Curtis PCoA plot:
Dj1<- wes_palette("Darjeeling1")
pcoacolor<- c("#03396c", "#005b96", "#93DB70", Dj1[3])
ordu = ordinate(physeqRA, "PCoA", "bray")
p<-plot_ordination(physeqRA, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=pcoacolor)+ geom_point(size=2) +theme(text = element_text(size=24))
p2<-p + theme(legend.position="none")
p2
There is clustering by sample depth: DCM, Surface, and Mid/Bottom. In the DCM and Surface, there is also clustering by filter type.
ggsave("p2.pdf", width = 6, height = 5 )
Plot legend:
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
legend <- g_legend(p)
plegend <- grid.arrange(legend)
The strongest determiner of clustering seems to be depth. To better observe clustering within depth groups, subset by depth and then filter type.
physeqSurf<- subset_samples(physeqRA, Depth == "Surface")
ordu = ordinate(physeqSurf, "PCoA", "bray")
p<-plot_ordination(physeqSurf, ordu, color="Station", shape="filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3) +ggtitle("Surface - Bray Curtis")
p
Strong separation by filter type
physeqSurf10<- subset_samples(physeqRA, Depth == "Surface" & filter == "10")
ordu = ordinate(physeqSurf10, "PCoA", "bray")
p<-plot_ordination(physeqSurf10, ordu, color="Station")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Replicates look good; stations 3 and 15 are the only replicates whose nearest neighbor is not from the same station.
physeqSurf2<- subset_samples(physeqRA, Depth == "Surface" & filter == "2")
ordu = ordinate(physeqSurf2, "PCoA", "bray")
p<-plot_ordination(physeqSurf2, ordu, color="Station")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Replicates are better in the 0.2 um filter samples than the 10.0 um filters.
physeqSCM<- subset_samples(physeqRA, Depth == "SCM")
ordu = ordinate(physeqSCM, "PCoA", "bray")
p<-plot_ordination(physeqSCM, ordu, color="Station", shape="filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Again, strong clustering by filter type.
physeqSCM10<- subset_samples(physeqRA, Depth == "SCM" & filter == "10")
ordu = ordinate(physeqSCM10, "PCoA", "bray")
p<-plot_ordination(physeqSCM10, ordu, color="Station")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Replicates look good.
physeqSCM2<- subset_samples(physeqRA, Depth == "SCM" & filter == "2")
ordu = ordinate(physeqSCM2, "PCoA", "bray")
p<-plot_ordination(physeqSCM2, ordu, color="Station")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Replicates look good.
physeqMB<- subset_samples(physeqRA, Depth == "Mid")
ordu = ordinate(physeqMB, "PCoA", "bray")
p<-plot_ordination(physeqMB, ordu, color="Station", shape="filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Clustering is by station rather than by filter type in the Mid and Bottom depths. Geographic location is a stronger determinant than filter size.
physeqB<- subset_samples(physeqRA, Depth == "Bottom")
ordu = ordinate(physeqB, "PCoA", "bray")
p<-plot_ordination(physeqB, ordu, color="Station", shape="filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Clustering is by station rather than by filter type.
Prepare metadata table:
metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$desc <- paste(metatable$Station, metatable$Depth, metatable$filter, sep="-")
metatable <- metatable[metatable$SampleID %ni% mixedup,]
metatable$Depth <- as.character(metatable$Depth)
metatable$Station <-as.character(metatable$Station)
metatable$filter<-as.character(metatable$filter)
metatable<- metatable[,-which(names(metatable) == "SampleID")]
META<- sample_data(metatable)
ps<- subset_samples(ps, SampleID %ni% mixedup)
ps2<- ps
sample_data(ps2) <- META
print(ps2)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 23115 taxa and 211 samples ]
## sample_data() Sample Data: [ 211 samples by 5 sample variables ]
## tax_table() Taxonomy Table: [ 23115 taxa by 8 taxonomic ranks ]
str(sample_data(ps2))
## 'data.frame': 211 obs. of 5 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 5
## .. ..$ : chr "11" "10" "10" "10" ...
## .. ..$ : chr "Mid" "Bottom" "Mid" "SCM" ...
## .. ..$ : int 700 1510 700 85 85 85 700 700 1510 1510 ...
## .. ..$ : chr "2" "2" "10" "2" ...
## .. ..$ : chr "11-Mid-2" "10-Bottom-2" "10-Mid-10" "10-SCM-2" ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "F142" "F150" "F151" "F154" ...
## ..@ .S3Class : chr "data.frame"
Merge:
mergedps <- merge_samples(ps2, "desc")
## Warning in asMethod(object): NAs introduced by coercion
## Warning in asMethod(object): NAs introduced by coercion
print(mergedps)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 23115 taxa and 112 samples ]
## sample_data() Sample Data: [ 112 samples by 5 sample variables ]
## tax_table() Taxonomy Table: [ 23115 taxa by 8 taxonomic ranks ]
number of samples reduced from 211 to 112
sample_names(mergedps)
## [1] "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2"
## [5] "10-SCM-10" "10-SCM-2" "10-Surface-10" "10-Surface-2"
## [9] "11-Bottom-10" "11-Bottom-2" "11-Mid-10" "11-Mid-2"
## [13] "11-SCM-10" "11-SCM-2" "11-Surface-10" "11-Surface-2"
## [17] "12-Bottom-10" "12-Bottom-2" "12-Mid-10" "12-Mid-2"
## [21] "12-SCM-10" "12-SCM-2" "12-Surface-10" "12-Surface-2"
## [25] "13-Bottom-10" "13-Bottom-2" "13-Mid-10" "13-Mid-2"
## [29] "13-SCM-10" "13-SCM-2" "13-Surface-10" "13-Surface-2"
## [33] "14-Bottom-10" "14-Bottom-2" "14-Mid-10" "14-Mid-2"
## [37] "14-SCM-10" "14-SCM-2" "14-Surface-10" "14-Surface-2"
## [41] "15-Bottom-10" "15-Bottom-2" "15-Mid-10" "15-Mid-2"
## [45] "15-SCM-10" "15-SCM-2" "15-Surface-10" "15-Surface-2"
## [49] "17-Bottom-10" "17-Bottom-2" "17-Mid-10" "17-Mid-2"
## [53] "17-SCM-10" "17-SCM-2" "17-Surface-10" "17-Surface-2"
## [57] "18-Bottom-10" "18-Bottom-2" "18-Mid-10" "18-Mid-2"
## [61] "18-SCM-10" "18-SCM-2" "18-Surface-10" "18-Surface-2"
## [65] "2-Bottom-10" "2-Bottom-2" "2-Mid-10" "2-Mid-2"
## [69] "2-SCM-10" "2-SCM-2" "2-Surface-10" "2-Surface-2"
## [73] "3-Bottom-10" "3-Bottom-2" "3-Mid-10" "3-Mid-2"
## [77] "3-SCM-10" "3-SCM-2" "3-Surface-10" "3-Surface-2"
## [81] "4-Bottom-10" "4-Bottom-2" "4-Mid-10" "4-Mid-2"
## [85] "4-SCM-10" "4-SCM-2" "4-Surface-10" "4-Surface-2"
## [89] "5-Bottom-10" "5-Bottom-2" "5-Mid-10" "5-Mid-2"
## [93] "5-SCM-10" "5-SCM-2" "5-Surface-10" "5-Surface-2"
## [97] "8-Bottom-10" "8-Bottom-2" "8-Mid-10" "8-Mid-2"
## [101] "8-SCM-10" "8-SCM-2" "8-Surface-10" "8-Surface-2"
## [105] "9-Bottom-10" "9-Bottom-2" "9-Mid-10" "9-Mid-2"
## [109] "9-SCM-10" "9-SCM-2" "9-Surface-10" "9-Surface-2"
But metatable inside phyloseq object was disturbed …
str(sample_data(mergedps))
## 'data.frame': 112 obs. of 5 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 5
## .. ..$ : num 10 10 10 10 10 10 10 10 11 11 ...
## .. ..$ : num NA NA NA NA NA NA NA NA NA NA ...
## .. ..$ : num 1510 1510 700 700 85 85 0 0 1210 1210 ...
## .. ..$ : num 10 2 10 2 10 2 10 2 10 2 ...
## .. ..$ : num NA NA NA NA NA NA NA NA NA NA ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ .S3Class : chr "data.frame"
Fix meta table in phyloseq object:
meta2<-as.data.frame(sample_data(mergedps))
split<- do.call(rbind, strsplit(row.names(meta2), "-"))
meta2$Depth <- split[,2]
meta2$filter<-split[,3]
meta2$desc <- row.names(meta2)
meta2$Station <- as.character(meta2$Station)
META <-sample_data(meta2)
sample_data(mergedps)<-META
Fixed!
str(sample_data(mergedps))
## 'data.frame': 112 obs. of 5 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 5
## .. ..$ : chr "10" "10" "10" "10" ...
## .. ..$ : chr "Bottom" "Bottom" "Mid" "Mid" ...
## .. ..$ : num 1510 1510 700 700 85 85 0 0 1210 1210 ...
## .. ..$ : chr "10" "2" "10" "2" ...
## .. ..$ : chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ .S3Class : chr "data.frame"
Create phyloseq object with acantharian ASVs only:
phA<-subset_taxa(mergedps, D3 =="Acantharea")
phAra<- transform_sample_counts(phA, function(OTU) 100* OTU/sum(OTU))
print(phAra)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1079 taxa and 112 samples ]
## sample_data() Sample Data: [ 112 samples by 5 sample variables ]
## tax_table() Taxonomy Table: [ 1079 taxa by 8 taxonomic ranks ]
ordu = ordinate(phAra, "PCoA", "bray")
p<-plot_ordination(phAra, ordu, color ="Depth", shape = "filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
Clustering by Depth, but different pattern than the whole community; now SCM and Surface are very similar and Mid and Bottom are more divergent. There is no longer clear separation between filter size in the DCM and Surface samples.
Colored by depth:
phAraS<- subset_samples(phAra, Depth == "SCM" | Depth == "Surface")
ordu = ordinate(phAraS, "PCoA", "bray")
p<-plot_ordination(phAraS, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=cbPalette)+ geom_point(size=3)
p
Clustering by depth, not by filter type.
Colored by station:
phAraS<- subset_samples(phAra, Depth == "SCM" | Depth == "Surface")
ordu = ordinate(phAraS, "PCoA", "bray")
p<-plot_ordination(phAraS, ordu, color="Station", shape="filter")+theme_bw() +scale_color_manual(values=colors)+ geom_point(size=3)
p
No clearly apparent geographic patterns.
Dj<- wes_palette("Darjeeling2")
Cv <- wes_palette("Cavalcanti1")
Gb<- wes_palette("GrandBudapest1")
wpcolor<- c(Dj[2], Dj[4], Cv[2], Cv[4], Gb[2], Cv[5])
taxabarplot<-plot_bar(phAra, fill = "D4") + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=wpcolor) + theme(legend.title=element_blank()) + geom_bar(aes( fill=D4), stat="identity", position="stack") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
t2<-taxabarplot + theme(legend.position="bottom")
t2
ggsave("t2.pdf", width = 12, height =7)
The relative abundance plot really clearly shows that the filters for each sample are very similar at all four water depths. This is different than the pattern seen for the full community, where the samples clustered by filter type in the surface and SCM. The surface samples are dominated by photosymbiotic acantharian sequences (Arth/Symph). The DCM sees the addition of Acantharea-Group-II and Group VI and slightly more Chaunacanthida. Mid and Bottom samples are dominated by Group-I, but still has the others as well, including photosymbiotic clades.
GP.ord <- ordinate(phAra, "NMDS", "bray")
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.1188792
## Run 1 stress 0.1168515
## ... New best solution
## ... Procrustes: rmse 0.01060792 max resid 0.1095591
## Run 2 stress 0.129397
## Run 3 stress 0.1270149
## Run 4 stress 0.1436838
## Run 5 stress 0.1239147
## Run 6 stress 0.129396
## Run 7 stress 0.1397795
## Run 8 stress 0.1281955
## Run 9 stress 0.1352331
## Run 10 stress 0.1303261
## Run 11 stress 0.1397795
## Run 12 stress 0.1234518
## Run 13 stress 0.1281028
## Run 14 stress 0.1234518
## Run 15 stress 0.1222387
## Run 16 stress 0.1234595
## Run 17 stress 0.1245755
## Run 18 stress 0.1270611
## Run 19 stress 0.1281913
## Run 20 stress 0.1443523
## *** No convergence -- monoMDS stopping criteria:
## 18: stress ratio > sratmax
## 2: scale factor of the gradient < sfgrmin
p1 = plot_ordination(phAra, GP.ord, type="taxa", color="D4", title="taxa") +scale_color_manual(values=colors)
print(p1)
p1 + facet_wrap(~D4, 3)
Acantharea-group-1 is the only group with a clear pattern: primarily present in deep/mid cluster and absent in the surface.
p4 = plot_ordination(phAra, GP.ord, type="split", color="Depth", shape="filter", label="Station", title="split")
p4
ps17 <- subset_samples(mergedps, Station == "17")
table(tax_table(ps17)[, "D2"], exclude = NULL)
##
## Alveolata_X Amoebozoa_X Apicomplexa
## 37 1 51
## Apusomonadidae Centroheliozoa Cercozoa
## 8 49 474
## Chlorophyta Choanoflagellida Ciliophora
## 210 113 1376
## Conosa Cryptophyta Dinoflagellata
## 2 43 10967
## Discoba Eukaryota_XX Foraminifera
## 6 1 4
## Fungi Haptophyta Hilomonadea
## 221 560 3
## Katablepharidophyta Lobosa Mesomycetozoa
## 9 12 31
## Metamonada Metazoa Ochrophyta
## 1 1491 828
## Opisthokonta_X Perkinsea Picozoa
## 25 91 124
## Radiolaria Rhodophyta Stramenopiles_X
## 3335 9 1196
## Streptophyta Telonemia <NA>
## 46 90 1701
ps17<- subset_taxa(ps17, D1 != "")
dim(table(tax_table(ps17)[, "D2"], exclude = NULL))
## [1] 33
sample_data(ps17)
## Station Depth m filter desc
## 17-Bottom-10 17 Bottom 770 10 17-Bottom-10
## 17-Bottom-2 17 Bottom 770 2 17-Bottom-2
## 17-Mid-10 17 Mid 700 10 17-Mid-10
## 17-Mid-2 17 Mid 700 2 17-Mid-2
## 17-SCM-10 17 SCM 100 10 17-SCM-10
## 17-SCM-2 17 SCM 100 2 17-SCM-2
## 17-Surface-10 17 Surface 0 10 17-Surface-10
## 17-Surface-2 17 Surface 0 2 17-Surface-2
ps17 = tax_glom(ps17, "D3")
ps17ra <- transform_sample_counts(ps17, function(OTU) 100* OTU/sum(OTU))
ps17ra<-subset_samples(ps17ra, filter == "10")
taxabarplot<-plot_bar(ps17ra, fill = "D3") + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=colors2) + theme(legend.title=element_blank()) + geom_bar(aes( fill=D3), stat="identity", position="stack") +theme_bw()
taxabarplot + theme(legend.position="none") + coord_flip()
Need to glom all taxa to D2, but keep acantharea separate (to create a clean plot)
ps17<- subset_taxa(ps17, D2 != "Metazoa")
ps17noA <- subset_taxa(ps17, D3 !="Acantharea")
ps17Aonly<- subset_taxa(ps17, D3 =="Acantharea")
ps17noA = tax_glom(ps17noA, "D2")
taxx <- as.data.frame(phyloseq::tax_table(ps17noA), stringsAsFactors = F)
taxx$D3 <- taxx$D2
taxxA <- as.data.frame(phyloseq::tax_table(ps17Aonly), stringsAsFactors = F)
taxx2<- rbind(taxx, taxxA)
otus <- as.data.frame(phyloseq::otu_table(ps17noA), stringsAsFactors = F)
otusA <- as.data.frame(phyloseq::otu_table(ps17Aonly), stringsAsFactors = F)
otusF <- cbind(otus, otusA)
OTUFinal <-otu_table(otusF, taxa_are_rows=FALSE)
taxxm <- as.matrix(taxx2)
TAXA <- tax_table(taxxm)
finalps<- phyloseq(OTUFinal, TAXA, META )
Dj1<- wes_palette("Darjeeling1")
Dj<- wes_palette("Darjeeling2")
Cv <- wes_palette("Cavalcanti1")
Gb<- wes_palette("GrandBudapest1")
Gb2<- wes_palette("GrandBudapest2")
Br<- wes_palette("BottleRocket2")
Rm<- wes_palette("Rushmore1")
R2<-wes_palette("Royal2")
mr3<- wes_palette("Moonrise3")
wpcolor<- c(Gb[2:4], Dj1[1:4], "forestgreen", Cv[1:3], Dj[1:4], R2[3],Br[1:4], Rm[2:4], Gb2,mr3, "white" )
ps17ra <- transform_sample_counts(finalps, function(OTU) 100* OTU/sum(OTU))
ps17ra<-subset_samples(ps17ra, filter == "10")
taxabarplot<-plot_bar(ps17ra, fill = "D3") + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=wpcolor) + theme(legend.title=element_blank()) + geom_bar(aes( fill=D3), stat="identity", position="stack") +theme_bw() + theme(text = element_text(size=24))+ coord_flip()
tbp17<-taxabarplot + theme(legend.position="none")
tbp17
#ggsave("tbp17.pdf", width = 8, height = 5)
legend <- g_legend(taxabarplot)
plegend <- grid.arrange(legend)
ps17raAsonly<- subset_taxa(ps17ra, D3 == "Acantharea")
otu_table(ps17raAsonly)
## OTU Table: [1 taxa and 4 samples]
## taxa are columns
## 70ed1b62b4a0153b667f7d1077f2ed08
## 17-Bottom-10 4.964011
## 17-Mid-10 5.223983
## 17-SCM-10 2.994037
## 17-Surface-10 3.307148
ps3 <- subset_samples(mergedps, Station == "3")
table(tax_table(ps3)[, "D2"], exclude = NULL)
##
## Alveolata_X Amoebozoa_X Apicomplexa
## 37 1 51
## Apusomonadidae Centroheliozoa Cercozoa
## 8 49 474
## Chlorophyta Choanoflagellida Ciliophora
## 210 113 1376
## Conosa Cryptophyta Dinoflagellata
## 2 43 10967
## Discoba Eukaryota_XX Foraminifera
## 6 1 4
## Fungi Haptophyta Hilomonadea
## 221 560 3
## Katablepharidophyta Lobosa Mesomycetozoa
## 9 12 31
## Metamonada Metazoa Ochrophyta
## 1 1491 828
## Opisthokonta_X Perkinsea Picozoa
## 25 91 124
## Radiolaria Rhodophyta Stramenopiles_X
## 3335 9 1196
## Streptophyta Telonemia <NA>
## 46 90 1701
ps3<- subset_taxa(ps3, D3 != "")
ps3<- subset_taxa(ps3, D2 != "Metazoa")
sample_data(ps3)
## Station Depth m filter desc
## 3-Bottom-10 3 Bottom 2300 10 3-Bottom-10
## 3-Bottom-2 3 Bottom 2300 2 3-Bottom-2
## 3-Mid-10 3 Mid 1000 10 3-Mid-10
## 3-Mid-2 3 Mid 1000 2 3-Mid-2
## 3-SCM-10 3 SCM 72 10 3-SCM-10
## 3-SCM-2 3 SCM 72 2 3-SCM-2
## 3-Surface-10 3 Surface 0 10 3-Surface-10
## 3-Surface-2 3 Surface 0 2 3-Surface-2
Again need to glom all taxa to D2, but keep acantharea separate:
ps3noA <- subset_taxa(ps3, D3 !="Acantharea")
ps3Aonly<- subset_taxa(ps3, D3 =="Acantharea")
ps3noA = tax_glom(ps3noA, "D2")
taxx <- as.data.frame(phyloseq::tax_table(ps3noA), stringsAsFactors = F)
taxx$D3 <- taxx$D2
taxxA <- as.data.frame(phyloseq::tax_table(ps3Aonly), stringsAsFactors = F)
taxx2<- rbind(taxx, taxxA)
otus <- as.data.frame(phyloseq::otu_table(ps3noA), stringsAsFactors = F)
otusA <- as.data.frame(phyloseq::otu_table(ps3Aonly), stringsAsFactors = F)
otusF <- cbind(otus, otusA)
OTUFinal <-otu_table(otusF, taxa_are_rows=FALSE)
taxxm <- as.matrix(taxx2)
TAXA <- tax_table(taxxm)
finalps<- phyloseq(OTUFinal, TAXA, META )
ps3ra <- transform_sample_counts(finalps, function(OTU) 100* OTU/sum(OTU))
ps3ra<-subset_samples(ps3ra, filter == "10")
taxabarplot<-plot_bar(ps3ra, fill = "D3") + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=wpcolor) + theme(legend.title=element_blank()) + geom_bar(aes( fill=D3), stat="identity", position="stack") +theme_bw() + theme(text = element_text(size=24))+ coord_flip()
tbp3<-taxabarplot + theme(legend.position="none")
tbp3
ggsave("tbp3.pdf", width = 8, height = 5)
ps3raAsonly<- subset_taxa(ps3ra, D3 == "Acantharea")
#otu_table(ps3raAsonly)
Load additional packages:
library(readr)
library(reshape2)
library("plyr", lib.loc="/Library/Frameworks/R.framework/Versions/3.4/Resources/library")
Load and format CTD data:
ctd17 <- read.csv("~/Desktop/MiraiDPI/DPI_CTD_data/6KCDT0006/CTD/6KCDT0006_CTD_data(1sec_int).csv")
ctd17 <- ctd17[,1:4]
names(ctd17) <- c("Time", "Temp", "Salinity", "Depth")
ctd17$Depth <- ctd17$Depth * -1
ctd17$station <- 17
ctd17$Time <- gsub(":", "", ctd17$Time)
ctd17$Time2 <- paste0('201706090', ctd17$Time)
st17aCTD <- ctd17[,c(4,6)]
names(st17aCTD)<-c("Depth","photos")
str(st17aCTD)
## 'data.frame': 9576 obs. of 2 variables:
## $ Depth : num -12.6 -12.9 -13.1 -13.3 -13.3 ...
## $ photos: chr "20170609085210" "20170609085211" "20170609085212" "20170609085213" ...
Load and format image data:
st17as<- read.table("~/Desktop/MiraiDPI/st17aRound.txt", header = TRUE)
st17as$photos <- strtrim(st17as$photos, 14)
st17as$photos <- as.numeric(st17as$photos)
Add depth from CTD data to images:
st17acanth <- merge(st17aCTD,st17as, by = "photos" )
Plot histogram of images by depth:
Ahist<- ggplot(st17acanth, aes(st17acanth$Depth)) + geom_histogram(breaks=seq(-800,0,by=20), col="black") +coord_flip() + theme_bw() +ggtitle("Station 17")
Ahist
Calculate the sampling volume (in order to get cells/L): Load names of all images from station 17 downcast:
st17pics<- read.table("~/desktop/MiraiDPI/st17DPI/ALLnames.txt", header = TRUE)
st17pics$photos <- strtrim(st17pics$photos, 14)
st17pics$photos <- as.numeric(st17pics$photos)
## Warning: NAs introduced by coercion
st17pics <- merge(st17aCTD,st17pics, by = "photos" )
Histogram of images:
pichist<- ggplot(st17pics, aes(st17pics$Depth)) + geom_histogram(breaks=seq(-800,0,by=20), col="black") +coord_flip() + theme_bw()
pichist
More images were taken at the surface and the bottome –> the camera spent more time at the surface and the bottom.
Get dataframe of the number of photos by depth at 20m intervals:
br = seq( -800, 0, by = 20)
ranges = br[-1]
freq = hist(st17pics$Depth, breaks=br, include.lowest=TRUE, plot=FALSE)
P<- data.frame(depth = ranges, photofrequency = freq$counts)
str(P)
## 'data.frame': 40 obs. of 2 variables:
## $ depth : num -780 -760 -740 -720 -700 -680 -660 -640 -620 -600 ...
## $ photofrequency: int 0 241 51 52 33 51 51 51 52 110 ...
How many photos total?
sum(P$photofrequency)
## [1] 2453
Get same data frame for Acantharian frequency:
br = seq( -800, 0, by = 20)
ranges = br[-1]
freq = hist(st17acanth$Depth, breaks=br, include.lowest=TRUE, plot=FALSE)
A<- data.frame(depth = ranges, Acanthfrequency = freq$counts)
str(A)
## 'data.frame': 40 obs. of 2 variables:
## $ depth : num -780 -760 -740 -720 -700 -680 -660 -640 -620 -600 ...
## $ Acanthfrequency: int 0 2 0 0 0 1 0 1 2 0 ...
Divide the acantharian frequency by photo frequency and convert to cells/L.
frequencyDF <- merge(A, P, by = "depth")[-1,]
frequencyDF$conc <- frequencyDF$Acanthfrequency / frequencyDF$photofrequency
frequencyDF$perLiter <- frequencyDF$conc / 0.4
Plot acantharian cells/L by depth:
library("wesanderson")
Dj1<- wes_palette("Darjeeling1")
perLplotSt17 <- ggplot(frequencyDF, aes(x=depth, y= perLiter)) + geom_bar(stat = "identity") + coord_flip() + scale_y_continuous(limits= c(0,1.25), expand = c(0, 0)) + scale_x_continuous(limits= c(-1000,0), expand = c(0, 0)) +theme_bw() + theme(text=element_text(size=28)) +xlab("Depth (m)") + ylab("")+ geom_vline(xintercept = -100 , color = "#93DB70", linetype = 5, size = 1)+ geom_vline(xintercept = -700 , color = "#005b96", linetype = 5, size = 1) +geom_vline(xintercept = -770 , color = "#03396c", linetype = 5, size = 1) + geom_vline(xintercept = -5 , color = Dj1[3], linetype = 5, size = 1)
perLplotSt17
#ggsave("perLplotSt17.pdf")
Load and format CTD data:
ctd3 <- read.csv("~/Desktop/MiraiDPI/DPI_CTD_data/6KCDT0003/CTD/6KCDT0003_CTD_data(1sec_int).csv")
ctd3 <- ctd3[,1:4]
names(ctd3) <- c("Time", "Temp", "Salinity", "Depth")
ctd3$Depth <- ctd3$Depth * -1
ctd3$station <- 3
ctd3$Time <- gsub(":", "", ctd3$Time)
ctd3$Time2 <- paste0('201705310', ctd3$Time)
st3aCTD <- ctd3[,c(4,6)]
names(st3aCTD)<-c("Depth","photos")
str(st3aCTD)
## 'data.frame': 6584 obs. of 2 variables:
## $ Depth : num -7.39 -7.28 -7.21 -7.22 -7.26 ...
## $ photos: chr "20170531074319" "20170531074320" "20170531074321" "20170531074322" ...
Load and format image data:
st3as<- read.table("~/Desktop/MiraiDPI/st3DPI/st3aRound.txt", header = TRUE)
st3as$photos <- strtrim(st3as$photos, 14)
st3as$photos <- as.numeric(st3as$photos)
Add depth from CTD to images:
st3acanth <- merge(st3aCTD,st3as, by = "photos" )
Plot histogram: image frequency by depth
Ahist3<- ggplot(st3acanth, aes(st3acanth$Depth)) + geom_histogram(breaks=seq(-1000,0,by=25), col="black") +coord_flip() + theme_bw()
Ahist3
Calculate the sampling volume (in order to get cells/L): Load names of all images from station 3 downcast:
st3pics<- read.table("~/desktop/MiraiDPI/st3DPI/ALLnames.txt", header = TRUE)
st3pics$photos <- strtrim(st3pics$photos, 14)
st3pics$photos <- as.numeric(st3pics$photos)
st3pics <- merge(st3aCTD,st3pics, by = "photos" )
st3pics <- st3pics[,-3]
names(st3pics)<- c("photos", "Depth")
st3pics$photos <- as.character(st3pics$photos)
Histogram of image frequency:
pichist<- ggplot(st3pics, aes(st3pics$Depth)) + geom_histogram(breaks=seq(-1000,0,by=25), col="black") +coord_flip() + theme_bw()
pichist
Image frequency is more even, and just plain more, than station 17.
Get data frame of image frequency by depth in 20m intervals:
br = seq( -1020, 0, by = 20)
ranges = br[-1]
freq = hist(st3pics$Depth, breaks=br, include.lowest=TRUE, plot=FALSE)
P3<- data.frame(depth = ranges, photofrequency = freq$counts)
str(P3)
## 'data.frame': 51 obs. of 2 variables:
## $ depth : num -1000 -980 -960 -940 -920 -900 -880 -860 -840 -820 ...
## $ photofrequency: int 109 77 64 71 77 79 76 75 77 77 ...
How many photos total?
sum(P3$photofrequency)
## [1] 4010
Get same data frame for Acantharian frequency:
br = seq( -1020, 0, by = 20)
ranges = br[-1]
freq = hist(st3acanth$Depth, breaks=br, include.lowest=TRUE, plot=FALSE)
A3<- data.frame(depth = ranges, Acanthfrequency = freq$counts)
str(A3)
## 'data.frame': 51 obs. of 2 variables:
## $ depth : num -1000 -980 -960 -940 -920 -900 -880 -860 -840 -820 ...
## $ Acanthfrequency: int 0 0 0 0 0 0 0 2 0 0 ...
Divide the acantharian frequency by photo frequency and convert to cells/L. And, plot cells/L by depth:
frequencyDF <- merge(A3, P3, by = "depth")[-1,]
frequencyDF$conc <- frequencyDF$Acanthfrequency / frequencyDF$photofrequency
frequencyDF$perLiter <- frequencyDF$conc / 0.4
perLplotSt3 <- ggplot(frequencyDF, aes(x=depth, y= perLiter)) + geom_bar(stat = "identity") + coord_flip() + scale_y_continuous(limits= c(0,1.25), expand = c(0, 0)) +theme_bw() + theme(text=element_text(size=24)) +xlab("Depth (m)") + ylab("Acantharian Cells per Liter") + geom_vline(xintercept = -72 , color = "#93DB70", linetype = 5, size = 1)+ geom_vline(xintercept = -995 , color = "#005b96", linetype = 5, size = 1) +ggtitle("")+ scale_x_continuous(limits= c(-1000,0), expand = c(0, 0)) + geom_vline(xintercept = -5 , color = Dj1[3], linetype = 5, size = 1)
perLplotSt3
#ggsave("perLplotSt3.pdf")
cropped all acantharian photos to touch the edges of the acantharian spines. Then got image size in pixels with: file ./* > imagesize.csv
Load and format data:
imagesize <- read.csv("~/Desktop/MiraiDPI/st17DPI/croppedacanths/imagesize.csv", header = FALSE)
imagesize$V1 <- substring(imagesize$V1, 3, 16)
split<- do.call(rbind, strsplit(as.character(imagesize$V2), " "))
imagesize$height <- as.numeric(split[,2])
imagesize$width <- as.numeric(split[,4])
imagesize<- imagesize[,c(1,5,6)]
names(imagesize) <- c("photos", "height", "width")
imagesizedepth<- merge(st17aCTD,imagesize, by = "photos" )
imagesizedepth$height <- (imagesizedepth$height * 21)/1000 #21um per pixel /1000 to convert um to mm
imagesizedepth$width <- (imagesizedepth$width * 21)/1000 #21um per pixel /1000 to convert um to mm
imagesizedepth$size <- imagesizedepth$height * imagesizedepth$width
str(imagesizedepth)
## 'data.frame': 246 obs. of 5 variables:
## $ photos: chr "20170609085210" "20170609085215" "20170609085215" "20170609085220" ...
## $ Depth : num -12.6 -13.2 -13.2 -13.1 -13.4 ...
## $ height: num 0.567 0.63 0.651 0.714 0.504 ...
## $ width : num 0.63 0.672 0.609 0.651 0.483 ...
## $ size : num 0.357 0.423 0.396 0.465 0.243 ...
Subset sizes to depth ranges:
library("dplyr")
surface <- filter(imagesizedepth, Depth >= -50)
surface$desc <- "Surface"
scm<- filter(imagesizedepth, Depth < -50 & Depth >= -100)
scm$desc <- "DCM"
meso<- filter(imagesizedepth, Depth < -100 & Depth >= -500)
meso$desc <- "Mid"
deep<- filter(imagesizedepth, Depth < -700)
deep$desc<- "Deep"
newdf17 <- rbind(surface, scm, meso, deep)
Plot size distribution by depth:
pcoacolor<- c("#93DB70" ,"#005b96", Dj1[3],"#03396c")
p17 <- ggplot(newdf17, aes(x=desc, y= size)) + geom_violin(aes(fill=desc)) +geom_boxplot(width=0.1) + theme_bw() + theme(text=element_text(size=30)) + scale_fill_manual(values=pcoacolor) +xlab("") + ylab("") +scale_x_discrete(limits = c("Deep", "Mid", "DCM", "Surface")) +scale_y_continuous(limits= c(0,9), expand = c(0, 0))
p17size<-p17 + coord_flip() + theme(legend.position="none")
p17size
#ggsave("p17size.pdf")
Load and format data:
imagesize <- read.csv("~/Desktop/MiraiDPI/st3DPI/croppedacanths/imagesize.csv", header = FALSE)
imagesize$V1 <- substring(imagesize$V1, 3, 16)
split<- do.call(rbind, strsplit(as.character(imagesize$V2), " "))
imagesize$height <- as.numeric(split[,2])
imagesize$width <- as.numeric(split[,4])
imagesize<- imagesize[,c(1,5,6)]
names(imagesize) <- c("photos", "height", "width")
imagesizedepth<- merge(st3aCTD,imagesize, by = "photos" )
imagesizedepth$height <- (imagesizedepth$height * 21)/1000 #21um per pixel /1000 to convert um to mm
imagesizedepth$width <- (imagesizedepth$width * 21)/1000 #21um per pixel /1000 to convert um to mm
imagesizedepth$size <- imagesizedepth$height * imagesizedepth$width
str(imagesizedepth)
## 'data.frame': 132 obs. of 5 variables:
## $ photos: chr "20170531074321" "20170531074322" "20170531074340" "20170531074342" ...
## $ Depth : num -7.21 -7.22 -7.09 -7.24 -7.24 ...
## $ height: num 1.092 0.735 0.42 0.777 0.483 ...
## $ width : num 1.008 0.441 0.462 0.672 0.399 ...
## $ size : num 1.101 0.324 0.194 0.522 0.193 ...
Subset sizes to depth ranges:
surface <- filter(imagesizedepth, Depth >= -50)
surface$desc <- "Surface"
scm<- filter(imagesizedepth, Depth < -50 & Depth >= -100)
scm$desc <- "DCM"
meso<- filter(imagesizedepth, Depth < -100 & Depth >= -500)
meso$desc <- "Mid"
deep<- filter(imagesizedepth, Depth < -700)
deep$desc<- "Deep"
newdf3 <- rbind(surface, scm, meso, deep)
Plot size distribution by depth:
pcoacolor<- c("#93DB70","#03396c", "#005b96", Dj1[3])
p3 <- ggplot(newdf3, aes(x=desc, y= size)) + geom_violin(aes(fill=desc)) + geom_boxplot(width=0.1) + theme_bw() + theme(text=element_text(size=30)) + scale_fill_manual(values=pcoacolor) +xlab("") + ylab("Image Area (mm^2)")+ scale_x_discrete(limits = c("Deep", "Mid", "DCM", "Surface")) +scale_y_continuous(limits= c(0,9), expand = c(0, 0))
p3size<- p3 + coord_flip() + theme(legend.position="none")
p3size
#ggsave("p3size.pdf")